Abstract 

A study was designed to investigate the effectiveness of a preservice teacher evaluation 
scheme used in the field experience component of the M.Ed. program in Mathematics, 
Science, and Technology Education at the Ohio State University. Subjects were student 
teachers (n=34), mentor teachers (n=34), and university-based supervisors (n=6). Student 
teachers’ performances were assessed independently by the student teacher, mentor, and 
supervisor at two intervals (midterm and final three-way conferences). Data were 
collected in the form of Intern Evaluation Worksheets, as well as observations of 
supervisor meetings. Data from the worksheets were analyzed using two-way ANOVA. 
Results indicated significant differences in evaluation scores among evaluation groups in 
the midterm conference, as well as between the two evaluation periods. The findings aid in 
understanding the dynamics that take place during the three-way conference 
evaluations and form a basis for the transition to new forms of teacher performance 
assessment. 



Introduction 

“Preservice teachers get their first major opportunity to test their teaching 
skills when they student teach. The development of perceived 
teaching adequacies during the student teaching experiences 
should be an affective predictor of future success.” (Wood and Etcher, 1989) 

The effectiveness of novice teachers can be estimated from their performance in 
field experiences, provided that the performance assessments used in these periods are 
compatible. Currently, evaluations of both first-year teachers and preservice teachers 
show a great deal of variation. In response to concerns about these variations, as well 
as about the quality of teaching, the Carnegie Forum and the Holmes Group issued 
recommendations to the educational community (Lucas, 1997), which resulted in standards 
for teaching and for the preparation of teachers. These standards have served as a 
framework aimed at raising the quality of inservice teachers, and they have also greatly 
affected the teacher performance evaluation process (Yinger, 1999). One second-generation 
example of this impact is the Classroom Performance Assessment Test, also known as 
Praxis III, through which entry-year teachers will be assessed. The state of Ohio declared 
that it would be the first to implement this system statewide, beginning in 2003 (Ohio 
Department of Education, 2000). 

The current practice of teacher performance evaluation in teacher education 
programs involves two types of evaluation. Summative evaluations are used to make 
judgments about the quality of teachers’ performances. This type of evaluation 
is used for accountability and to determine whether a teacher meets minimum standards 
(Dagley and Orso, 1991). Formative evaluations, on the other hand, are ongoing processes 
used to promote teacher growth by improving teacher performance. This type of 
evaluation requires supportive partnerships that can give teachers feedback for 
deciding how to improve their teaching. 

The supervisory evaluations (formative) of preservice teachers in teacher 
education programs are typically conducted by supervisors and mentor teachers. In the 
last decade, with the increased use of multiple source evaluations, the inclusion of student 
teachers in this process has become more and more common. Dyer states the benefit of 
this type of evaluation as follows: “The fundamental premise of multiple source 
evaluation is that data gathered from multiple perspectives are more comprehensive and 
objective than data gathered from only one source.” (Dyer, 1991, p. 35) However, this 
type of multiple source evaluation and feedback has some drawbacks. Despite the goodwill 
of the parties involved, the time and effort needed to build these supportive 
partnerships are the two major disadvantages of the process. For example, after classroom 
observations, neither the student teacher nor the supervisor has adequate time to reflect on 
the lesson taught. Furthermore, mentor teachers may be unclear about the expectations of 
their role and often express concerns about insufficient communication and support 
from the university. Moreover, some mentor teachers still regard the supervisor as 
an inspector rather than as part of a collaborative effort (Bolin and Panaritis, 
1992). Therefore, all of these drawbacks need to be taken into consideration when building 
the collaborative partnerships that will be active in multisource evaluations. 

Performance evaluations (summative), which are key components of 
supervisory evaluation, are typically conducted both during and at the end of the 
supervisory process. Active involvement of both the mentor teacher and the supervisor is 
desirable in this process. However, in practice, the input of mentor teachers in the 
decision-making stages is often so minimized that the evaluation becomes the sole 
judgment of the supervisor, a situation that has been seriously questioned in the last 
decade (Rust, 1992). 

During preservice teacher evaluation, the interaction between supervisory and 
performance evaluations is so intricate that it is hard to consider one separately from the 
other. Hazi (1994) advocates combining these two evaluation types, stating that 
“disentangling the supervision-evaluation knot is impossible”. Furthermore, Hunter 
(1988) views formative and summative evaluations as sequential processes that cannot 
be conducted separately. Since the interactions between these two evaluations are so 
strong and the processes are so compatible, the use of multiple source evaluation 
should be considered in performance evaluations as well. Even minimal 
involvement of mentors and student teachers in the decision-making process 
is preferable to not including them at all. 

Active participation of mentors, supervisors, and student teachers in both 
supervisory and decision-making processes could benefit each party in 
multiple ways. It could give mentor teachers clearer descriptions of their roles and, along 
the way, help them shift their supervisory mindset away from the view of the supervisor as an 
inspector. Opportunities for student teachers to think through their teaching jointly 
with supervisors and mentors, as well as to participate in the decision-making process, 
could lead to greater independence and reflection, a significant goal in most teacher 
education programs. This reflective approach could also help them meet challenges once 
they enter the profession. With increased involvement of mentor teachers, supervisors 
could find more opportunities to interact with the student teachers and help them grow 
professionally. Therefore, in all steps of preservice teacher evaluation, the use of multiple 
source evaluation will not only improve the engagement of the parties involved and 
benefit each of them, but will also increase the likelihood of obtaining a comprehensive 
picture of the performance of the person being evaluated. 

The teacher preparation program at OSU uses a triadic supervision model within 
which student teachers’ teaching competencies are assessed during field experiences. The 
performance assessment conducted during the three-way conferences each quarter 
involves participation and input from the student teacher, mentor and the supervisor. Both 
mid- and final evaluations rely on effective group dynamics and performance. 

The main purpose of this study is to examine the variance existing among these three 
evaluation groups (student teachers, mentors, and supervisors), as well as the variance 
between the two evaluation periods (mid- and final evaluation). The questions that 
shaped this study are: 

1. Do the means of student teacher performance scores differ significantly among 
   evaluation groups (student teacher, mentor, and supervisor)? 

2. Do the means of student teacher performance scores differ significantly between 
   the two evaluation periods (mid- and final evaluation)? 

3. Is there any interaction between evaluation group and evaluation period? 

Methodology 

The study was conducted during the winter quarter of 2001, as the student teachers 
began taking responsibility for teaching. The data were gathered with the evaluation 
instrument currently used in the program, which has Likert-type performance scales related 
to an inventory of teaching parameters. Test-retest reliability of the instrument was 0.78. 
The specific performance items on the instrument were derived from the following sources: a) 
previous evaluation inventories used at OSU, b) the experiences of the supervisors, c) the 
comments of the mentor teachers, d) suggestions from student teachers, and e) related 
literature. The instrument also included commentary sections for the parties to reflect on 
student teachers’ performances. Owing to concerns raised by student teachers, these sections 
of the instrument were not included in the study. 

This instrument formed the basis for discussion sessions among student teachers, 
mentors, and supervisors. In these conferences, the student teacher was independently 
evaluated by all three parties (self, mentor, and supervisor) through a process of reporting 
and discussing evaluative ratings, and future goals were set for professional development 
of the student teacher. The instrument was administered twice during the student teaching 
period. The first administration (mid-evaluation) was conducted during the fifth week; 
and the second administration (final evaluation) was conducted during the tenth week of 
the quarter. 

In addition to the three-way conferences, the program offered multiple 
opportunities to reinforce the group dynamics between the parties. The student teaching 
experience was supported by a weekly professional seminar that lasted for the entire 
quarter, with each session followed by supervisor-student teacher group discussions. 
Furthermore, supervisors met regularly to discuss concerns and ensure consistency in the 
supervision/evaluation process. Researchers attended these meetings to observe and 
record field notes, in order to more fully understand the process and enable better 
interpretation of the results. 

The sample was 37 student teachers, 37 mentors, and 8 supervisors. The 
instruments of the parties who gave consent to the study were collected after the three- 
way conferences and used for analysis. 

The data were then analyzed using two-way ANOVA to investigate the effects 
of evaluation group and evaluation time on the student teachers’ performance scores. 
Furthermore, one-way ANOVA was conducted to examine group differences 
in both the mid- and final evaluation periods. 

Results and Discussion 

A 3x2 (Evaluation group x Evaluation period) factorial analysis of variance was 
conducted on the evaluation scores. All statistical tests were conducted at the α = .05 level 
of significance. Table 1 contains means and standard deviations for performance scores 
by group (student teacher, mentor, and supervisor) in the two time periods (mid 
and final). The overall mean increased from 4.94 at the mid-evaluation to 5.29 at the 
final evaluation. 
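
The overall means quoted here can be recovered from the per-group means and group sizes reported in Table 1. The following is a minimal sketch of that arithmetic, not the authors' own computation: the variable and function names are illustrative, and the inputs are the rounded values from the table, so the results are approximate.

```python
# Group means and sizes as reported in Table 1, in the order
# student teacher, mentor, supervisor. Each entry is (mean, n).
mid = [(4.53, 21), (5.16, 23), (5.07, 28)]
final = [(5.10, 22), (5.32, 23), (5.41, 28)]


def weighted_mean(groups):
    """Return the size-weighted mean of a list of (mean, n) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(m * n for m, n in groups) / total_n


# Hypothetical names for the overall (pooled) means in each period.
mid_overall = weighted_mean(mid)
final_overall = weighted_mean(final)

print(round(mid_overall, 2), round(final_overall, 2))  # prints: 4.94 5.29
```

Because the groups differ in size, the overall mean is a weighted average rather than a simple average of the three group means.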

TABLE 1 

Means and Standard Deviations for Teacher Performance Scores, by Evaluation Groups in Mid and Final 
Evaluation Periods 

                      Mid Evaluation          Final Evaluation 
Evaluation Group      M      SD     n         M      SD     n 

Student Teacher       4.53   .53    21        5.10   .47    22 
Mentor                5.16   .67    23        5.32   .81    23 
Supervisor            5.07   .45    28        5.41   .39    28 
Total                 4.94   .61    72        5.12   .58    73 


The main effects of evaluation group and evaluation period, as well as their interaction, 
are presented in Table 2. 

TABLE 2 

Analysis of Variance for Teacher Evaluation Scores 

Source                  SS       df      MS      F 

Evaluation Group (G)    4.57     1       4.57    14.42* 
Evaluation Time (T)     5.47     2       2.74    8.64* 
G x T                   .99      2       .49     1.56 
Error                   44.00    139     .32 

*p < .05 



According to the results, no significant interaction between evaluation period and 
evaluation group was found. On the other hand, the main effects of evaluation group and 
evaluation period were significant at α = .05. The post-hoc comparison (Tukey HSD) showed 
that the scores of student teachers differed significantly from the scores of both mentors 
and supervisors (mean difference, student teacher minus mentor = -.42, p < .05; mean 
difference, student teacher minus supervisor = -.42, p < .05). However, there was no 
significant difference between mentor and supervisor scores. 
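
Each F ratio in Table 2 is a mean square (sum of squares divided by degrees of freedom) divided by the error mean square. The sketch below reproduces that arithmetic from the SS and df values as printed in the table; the function and variable names are illustrative assumptions, and small differences from the printed F values are expected because the table entries are rounded.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Return F = (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Error term as printed in Table 2.
SS_ERROR, DF_ERROR = 44.00, 139

# (SS, df) for each source, copied from Table 2 as printed.
rows = {
    "Evaluation Group (G)": (4.57, 1),
    "Evaluation Time (T)": (5.47, 2),
    "G x T": (0.99, 2),
}

f_values = {
    name: f_ratio(ss, df, SS_ERROR, DF_ERROR)
    for name, (ss, df) in rows.items()
}

for name, f in f_values.items():
    print(f"{name}: F = {f:.2f}")
```

The recomputed ratios fall within a few hundredths of the tabled values of 14.42, 8.64, and 1.56.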

Further analysis of each evaluation period using one-way ANOVA revealed that, 
at the mid-evaluation, the performance scores of mentors and supervisors differed 
significantly from those of student teachers (F (2, 72) = 8.611, p < .05). However, no 
significant difference was observed among groups in the final three-way conference (F 
(2, 73) = 1.764, p > .05). 
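
A one-way ANOVA F ratio can be approximated from summary statistics alone, using the group means, standard deviations, and sizes reported in Table 1. The following sketch is offered under that assumption and is not the authors' analysis, which used the raw worksheet scores; the function name is hypothetical, and the error degrees of freedom are taken as N minus the number of groups, the textbook definition.

```python
def one_way_f(groups):
    """Approximate a one-way ANOVA F ratio from (mean, sd, n) summaries.

    Returns (F, df_between, df_within).
    """
    k = len(groups)
    total_n = sum(n for _, _, n in groups)
    grand_mean = sum(m * n for m, _, n in groups) / total_n
    # Between-groups SS: squared deviation of each group mean, weighted by n.
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups)
    # Within-groups SS: recovered from each group's sample variance.
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups)
    df_between, df_within = k - 1, total_n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# (mean, sd, n) per group from Table 1: student teacher, mentor, supervisor.
mid = [(4.53, 0.53, 21), (5.16, 0.67, 23), (5.07, 0.45, 28)]
final = [(5.10, 0.47, 22), (5.32, 0.81, 23), (5.41, 0.39, 28)]

f_mid, df1_mid, df2_mid = one_way_f(mid)
f_final, df1_final, df2_final = one_way_f(final)

print(f"mid:   F({df1_mid}, {df2_mid}) = {f_mid:.2f}")
print(f"final: F({df1_final}, {df2_final}) = {f_final:.2f}")
```

With the rounded table values, the approximation gives roughly 8.4 for the mid-evaluation and 1.8 for the final evaluation, close to the reported 8.611 and 1.764 and consistent with the pattern described: a clear group difference at midterm that is no longer evident at the final conference.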

The comparison of the midterm and final evaluations indicated an improvement 
in performance scores as well as decreased variability among evaluation groups. It is 
highly possible that, as time progresses, collaboration among the parties increases. This 
collaboration may be characterized by a “landmarking effect” of the midterm three-way 
conference. Because this was the first time that the three parties had engaged in this important 
activity, the midterm conference became the first occasion on which a common 
evaluation scale was applied to a student teacher’s performance, and also the first time that 
ordinal values were assigned to the performance by each party. In many ways, then, this 
was the first occasion for the student teacher to see and hear how valued others viewed 
his or her performance on a number of dimensions of teaching practice during this placement. 
Likewise, this conference was the first occasion for mentor and supervisor to state these 
values and to hear how they compared with the values stated by the two others. In effect, 
there was a recursive feedback loop that involved stating an evaluation, hearing 
others’ judgments, internalizing consonance or dissonance between the values, discussing 
perceptions as justifications, and coming to agreement on a value that served as a 
baseline for the student teacher’s professional growth during the second half of the 
placement (and thus for the final three-way conference). In this way, the midterm 
conferences were seen to enable increased collaboration by “landmarking” for all parties 
the performance of the student teacher on the various dimensions of teaching reflected in the 
Intern Evaluation Worksheet. 

A second finding, that student teachers evaluated themselves lower than both 
mentors and supervisors did, was also discussed during the supervisor meetings. There, 
supervisors commented on the insufficient teaching experience of student teachers. The 
reason for this difference, in our view, may be related to the developmental stages 
that student teachers go through, as described by Fuller (1969). The first stage Fuller 
identified is the self stage, in which the student teacher is mostly concerned with 
self-oriented preoccupations. A second stage, called the task stage, is characterized by the 
student teacher’s focus on how to conduct the tasks that surround teaching. These stages 
form the developmental processes of both preservice and novice teachers. Fuller 
proposed that experienced teachers attain a third stage, in which their primary concern 
is the impact of their teaching on student learning. The increase in student teacher 
self-evaluations from midterm to final seems to mirror progression through the developmental 
stages described by Fuller. 
Implications 

The results of this study enable the teacher education program at this university to 
better understand the teacher evaluation scheme and how it works. The evaluation 
scheme is based on a collaborative model that places emphasis on the mentor-student 
teacher relationship, and the university supervisor plays a supporting role to this 
relationship. In this collaborative model, student-teacher self-evaluation is highly valued, 
and in the two evaluation conferences, student teachers speak first, followed by mentor 
and supervisor. The data from this study give strong indications of some of the dynamics 
at play in these conferences. These indications may be useful to other teacher educators 
who use a collaborative model. 

This study suggests that the triadic supervision model is an effective tool in 
obtaining a comprehensive picture of student teacher performance. Therefore, multiple 
source evaluations are recommended (with the active participation of student teachers, 
mentors and supervisors) for those in the science education community who are involved 
in preservice teacher education. 

As school districts in Ohio transition to the Praxis III entry-year teacher 
performance evaluation, it is essential that preservice teacher education programs 
become informed about the new system. They must transition to new teacher evaluation 
schemes, so that graduates of these programs are better supported in the entry years, 
when many energetic and talented people leave the profession. This study provides a 
baseline for the transition to Praxis III-style evaluation, linking our past 
with our future. 